Noise Reduction and Content Retrieval from Web Pages

Related resources

Fast Information Retrieval from Web Pages

In this paper, a new fast algorithm for information retrieval is presented. The algorithm relies on performing cross-correlation in the frequency domain between the input data and the input weights of fast neural networks (FNNs). It is proved mathematically and practically that the number of computation steps required for the presented FNNs is less than that needed by conventional neural networks ...

Advanced Information Retrieval from Web Pages

This work proposes a lightweight, web-based algorithm that runs at near real-time speed. It can retrieve the main parts (menu, main text, header, and footer) of a randomly selected web page, relying entirely on CSS, JavaScript, frames, layers, images, etc. for retrieval. The proposal also discusses the shortcomings of well-known modern algorithms for content retrieval from web pages. The al...

Main Content Extraction from Detailed Web Pages

Detailed web pages on the internet contain information that is not considered primary content, such as advertisements, headers, footers, navigation links, and copyright information. In addition, search engines prefer not to index information such as comments and reviews as informative content, so an algorithm that extracts only the main content could help better qua...

Analyzing new features of infected web content in detection of malicious web pages

Recent improvements in web standards and technologies enable attackers to hide and obfuscate infectious code with new methods and thus escape security filters. In this paper, we study the application of machine learning techniques to detecting malicious web pages. To detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...

Eliminating the Noise from Web Pages using Page Replacement Algorithm

Data mining is the process of mining information from a large set of data. It has several categories, such as text mining, web usage mining, and web content mining. Many types of algorithm are used in web mining, e.g. the Visitor method, the DOM tree, and the least recently used (LRU) algorithm. The Visitor and DOM tree methods are complex and time-consuming. The least recently used algorithm is less time c...


Journal

Journal title: International Journal of Computer Applications

Year: 2013

ISSN: 0975-8887

DOI: 10.5120/12729-9573